A Maximum-Entropy-Inspired Parser

Author

  • Eugene Charniak
Abstract

We present a new parser for parsing down to Penn tree-bank style parse trees that achieves 90.1% average precision/recall for sentences of length 40 and less, and 89.5% for sentences of length 100 and less, when trained and tested on the previously established [5,9,10,15,17] "standard" sections of the Wall Street Journal treebank. This represents a 13% decrease in error rate over the best single-parser results on this corpus [9]. The major technical innovation is the use of a "maximum-entropy-inspired" model for conditioning and smoothing that lets us successfully test and combine many different conditioning events. We also present some partial results showing the effects of different conditioning information, including a surprising 2% improvement due to guessing the lexical head's pre-terminal before guessing the lexical head.
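
To make the "13% decrease in error rate" concrete, the short sketch below works backwards from the reported 90.1% figure. It assumes (this is not stated in the abstract) that the 13% is a relative reduction and that it applies to the sentences of length 40 and less; the function name `implied_baseline_error` is illustrative only.

```python
def implied_baseline_error(new_accuracy_pct: float, relative_reduction: float) -> float:
    """Return the earlier error rate (in %) that the new error rate
    improves on by `relative_reduction` (a fraction, e.g. 0.13)."""
    new_error = 100.0 - new_accuracy_pct
    return new_error / (1.0 - relative_reduction)


# Reported figure from the abstract: 90.1% average precision/recall.
new_error = 100.0 - 90.1

# Assumption: the 13% decrease is relative and refers to this 90.1% figure.
baseline_error = implied_baseline_error(90.1, 0.13)
baseline_accuracy = 100.0 - baseline_error

print(round(new_error, 1), round(baseline_error, 1), round(baseline_accuracy, 1))
# prints: 9.9 11.4 88.6
```

Under those assumptions, the earlier best single-parser result would sit at roughly 88.6% average precision/recall, so an absolute gain of about 1.5 points corresponds to the quoted 13% relative error reduction.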

Similar resources

A maximum entropy semantic parser using word classes

This paper describes the parser that is used in the Sail Labs Conversational System, which is a spoken dialog system. This parser is a fully statistical, semantic parser. The probability model of the parser is based on the principle of maximum entropy. The maximum entropy framework makes it possible to combine the available information in a fully automatic way, but the training of maximum entropy models i...

A Linear Observed Time Statistical Parser Based on Maximum Entropy Models

This paper presents a statistical parser for natural language that obtains a parsing accuracy—roughly 87% precision and 86% recall—which surpasses the best previously published results on the Wall St. Journal domain. The parser itself requires very little human intervention, since the information it uses to make parsing decisions is specified in a concise and simple manner, and is combined in a...

Using a maximum entropy-based tagger to improve a very fast vine parser

In this short paper, an off-the-shelf maximum entropy-based POS-tagger is used as a partial parser to improve the accuracy of an extremely fast linear time dependency parser that provides state-of-the-art results in multilingual unlabeled POS sequence parsing.

A Maximum-Entropy Partial Parser for Unrestricted Text

This paper describes a partial parser that assigns syntactic structures to sequences of part-of-speech tags. The program uses the maximum entropy parameter estimation method, which allows a flexible combination of different knowledge sources: the hierarchical structure, parts of speech and phrasal categories. In effect, the parser goes beyond simple bracketing and recognises even fairly complex ...

A Maximum Entropy Chinese Character-Based Parser

The paper presents a maximum entropy Chinese character-based parser trained on the Chinese Treebank ("CTB" henceforth). Word-based parse trees in CTB are first converted into character-based trees, where word-level part-of-speech (POS) tags become constituent labels and character-level tags are derived from word-level POS tags. A maximum entropy parser is then trained on the character-based corpu...

Publication date: 2000